Anoxybacillus flavithermus subsp. yunnanensis is the only strictly thermophilic bacterium known 
to tolerate a broad range of toxic solvents at its optimal temperature of 55-60°C. The 
type strain E13T was isolated from water-sediment slurries collected from a hot spring. This 
study presents the draft genome sequence of A. flavithermus subsp. yunnanensis E13T and its 
annotation. The 2,838,393 bp genome (67 contigs) contains 3,035 protein-coding genes 
and 85 RNA genes, including 10 rRNA genes, and no plasmids. The genome was 
compared with the genomes of A. flavithermus subsp. flavithermus strains. 



Introduction 

Solvent-tolerant bacteria are a relatively new group of extremophilic 
microorganisms. They are able to overcome the toxic and destructive effects 
of organic solvents through unique adaptive mechanisms. Most of the reported 
solvent-tolerant bacteria are mesophiles with an optimal growth temperature 
of 25-37°C [1]. So far, Anoxybacillus flavithermus subsp. yunnanensis is the 
only strictly thermophilic bacterial species known to tolerate a broad range 
of solvents at its optimal temperature of 55-60°C [2,3]. The strains show 
unusual physiological features in the presence of solvents, such as a higher 
cell yield [2], an observable thickening of electron-transparent 
intracellular material and a distorted cytoplasm [3]. However, mechanisms of 
solvent tolerance in thermophilic species have not yet been proposed. 

The type strain E13T (=CCTCC AB2010187T =KCTC 13759T) and the additional 
strain PGDY12 were isolated in our laboratory from water-sediment slurries 
collected from a hot spring in Yunnan Province, China, and are most closely 
related to A. flavithermus subsp. flavithermus, which was first discovered in 
a hot spring in New Zealand [4]. At present, a total of 19 species and two 
subspecies of Anoxybacillus with validly published names have been reported 
[5]. None of these Anoxybacillus strains is reported to tolerate solvents 
except A. flavithermus subsp. yunnanensis. To understand the molecular basis 
of solvent tolerance at high temperature, we sequenced and annotated a draft 
genome of the type strain E13T of A. flavithermus subsp. yunnanensis. 

Classification and features 

A. flavithermus subsp. yunnanensis E13T (Table 1) was isolated in 2008 by 
static cultivation in rich Luria-Bertani (LB) medium supplemented with 10% 
ethanol [2]. This strain is a facultatively aerobic, Gram-positive, motile, 
spore-forming rod that is capable of utilizing a wide range of carbon 
sources, such as arabinose, cellobiose, galactose, maltose, trehalose and 
xylose. Strain E13T not only grows in ethanol concentrations of up to 13% at 
55°C, but also tolerates highly toxic solvents including toluene, benzene, 
xylene, chloroform and cyclohexane. Because A. flavithermus subsp. 
yunnanensis is the only strictly thermophilic bacterium known to tolerate 
toxic solvents, the effect of temperature on solvent tolerance had not 
previously been studied. Reports on tolerance to ethanol (a much less toxic 
solvent) indicated that ethanol tolerance decreases with increasing 
temperature [20,21]. A comparison of the growth of strain E13T at different 
temperatures showed that a temperature increase of 20°C, from 45 to 65°C, 
lowered the critical inhibitory toluene concentration from 0.56 to 0.31%. A 
similarly sharp decrease occurred for benzene, xylene, chloroform and 
cyclohexane. These results suggest that temperature plays a vitally 
important role in determining solvent tolerance in bacteria, which may 
explain why solvent-tolerant thermophilic bacteria are rare in nature. 



Currently, more than 30 solvent-tolerant mesophilic bacteria have been 
reported, and 8 genomes are available in GenBank. The phylogenetic position 
of A. flavithermus subsp. yunnanensis E13T among these typical 
solvent-tolerant bacteria is shown in Figure 1. This strain is most closely 
related to Bacillus species. The genomes of B. cereus strain E33L and strain 
ATCC 10987 might provide valuable guidance in a genetic analysis of the 
solvent tolerance of A. flavithermus subsp. yunnanensis E13T. 



Table 1. Classification and general features of A. flavithermus subsp. yunnanensis E13T according to the 
MIGS recommendations [6] 

MIGS ID   Property                Term                                                      Evidence code a)
          Current classification  Domain Bacteria                                           TAS [7]
                                  Phylum Firmicutes                                         TAS [8-10]
                                  Class Bacilli                                             TAS [11,12]
                                  Order Bacillales                                          TAS [13,14]
                                  Family Bacillaceae                                        TAS [13,15]
                                  Genus Anoxybacillus                                       TAS [16,17]
                                  Species Anoxybacillus flavithermus                        TAS [16]
                                  Subspecies Anoxybacillus flavithermus subsp. yunnanensis  TAS [2,18]
                                  Type strain E13T                                          TAS [2]
          Gram stain              positive                                                  TAS [2]
          Cell shape              rod                                                       TAS [2]
          Motility                motile                                                    TAS [2]
          Sporulation             sporulating                                               TAS [2]
          Temperature range       30-66°C                                                   TAS [2]
          Optimum temperature     60°C                                                      TAS [2]
          Carbon source           carbohydrates                                             TAS [2]
          Energy source           heterotrophic                                             TAS [2]
MIGS-6    Habitat                 hot spring                                                TAS [2]
MIGS-6.3  Salinity                optimum 0.3% (w/v) NaCl                                   TAS [2]
MIGS-22   Oxygen requirement      facultative anaerobe                                      TAS [2]
MIGS-15   Biotic relationship     free-living                                               TAS [2]
MIGS-14   Pathogenicity           non-pathogenic                                            NAS
MIGS-4    Geographic location     Yunnan, China                                             TAS [2]
MIGS-5    Sample collection time  2008                                                      IDA
MIGS-4.1  Latitude                N4°56.5951'                                               IDA
MIGS-4.2  Longitude               W98°26.2032'                                              IDA
MIGS-4.3  Depth                   water-sediment slurry (shallow)                           IDA
MIGS-4.4  Altitude                1,457 m above sea level                                   NAS

a) Evidence codes - IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author 
Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not 
directly observed for the living, isolated sample, but based on a generally accepted property for the 
species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [19]. If the 
evidence code is IDA, then the property was directly observed by one of the authors or an expert mentioned 
in the acknowledgements. 
[Figure 1: neighbor-joining tree. Taxa shown, from top to bottom, with GenBank accession numbers; 
brackets in the figure mark the Gram-negative and Gram-positive groups.] 

Pseudomonas putida str. Idaho (AGFJ00000000) 
Pseudomonas putida S12 (ALNR00000000) 
Pseudomonas putida DOT-T1E (NC_018220.1) 
Pseudomonas fluorescens P21 (FJ605510) 
Pseudomonas aeruginosa (EF515832) 
Pseudomonas citronellolis (Z76659) 
Acinetobacter radioresistens (GU145275) 
Aeromonas hydrophila (AB626121) 
Vibrio alginolyticus IBBCt2 (JN874640) 
Proteus vulgaris (AB680019) 
Proteus mirabilis (AB680151) 
Burkholderia cepacia ATCC 17759 (AY741334) 
Aromatoleum aromaticum EbN1 (NR_074676) 
Stenotrophomonas maltophilia (AB180661) 
Flavobacterium lutescens (AB680009) 
Sphingomonas aromaticivorans B0695 (U20755) 
Rhodococcus opacus B4 (NR_074632) 
Deinococcus geothermalis T27 (EU600161) 
Brevibacillus agri 13 (FJ715821) 
Brevibacillus brevis (EF079071) 
Anoxybacillus flavithermus subsp. yunnanensis E13T (HM016869) 
Bacillus licheniformis S-86 (AY017347) 
Bacillus subtilis (AB679982) 
Bacillus cereus ATCC 10987 (AJ577290) 
Bacillus cereus E33L (CP000001.1) 
Staphylococcus sp. LMG-19417 (AJ276810) 
Staphylococcus saprophyticus M36 (DQ462328) 

Figure 1. Phylogenetic tree highlighting the position of A. flavithermus subsp. yunnanensis E13T relative to other 
typical solvent-tolerant bacteria. The 16S rRNA sequences were aligned using ClustalX2, and phylogenetic 
inferences were obtained using the neighbor-joining method with the MEGA program. Species and GenBank accession 
numbers are indicated. Bootstrap values based on 1,000 replicates show the robustness of the branching. Scale 
bar represents 0.02 substitutions per nucleotide position. Strains with genome sequencing projects registered in 
GenBank are shown in bold. 



Genome sequencing information 

Genome project history 

The organism was selected for sequencing because of its unique 
characteristics as a solvent-tolerant thermophile and in order to 
investigate new mechanisms of solvent tolerance. The genome was sequenced at 
BGI-Shenzhen (Shenzhen, China) and deposited in GenBank under the accession 
number AVGH00000000. The version described in this paper is version 
AVGH01000000. To our knowledge, it was the first genome of A. flavithermus 
subsp. yunnanensis, the 8th genome of an Anoxybacillus species and the 9th 
genome of a solvent-tolerant bacterium to be sequenced. A summary of the 
project information associated with MIGS version 2.0 compliance [6] is shown 
in Table 2. 



Growth conditions and DNA isolation 

A. flavithermus subsp. yunnanensis strain E13T was grown in LB medium at 
60°C for 8 h. The cells were harvested by centrifugation at 12,000 × g and 
washed twice with distilled water. Genomic DNA from strain E13T was 
extracted with a Genomic DNA Mini Preparation Kit (Beyotime, Shanghai, 
China) according to the method for extracting genomic DNA from Gram-positive 
bacteria. The quality and concentration of the genomic DNA were measured by 
spectrophotometric analysis using a biophotometer (Eppendorf BioPhotometer 
Plus, Eppendorf, Germany). 
Table 2. Project information 

MIGS ID    Property                 Term
MIGS-31    Finishing quality        High-quality draft
MIGS-28    Libraries used           One 454 shotgun library and two paired-end Illumina HiSeq 2000 libraries
MIGS-29    Sequencing platforms     454 GS FLX Titanium, Illumina HiSeq 2000
MIGS-31.2  Fold coverage            52.5× 454 Titanium, 368.5× Illumina
MIGS-30    Assemblers               Newbler version 2.6
MIGS-32    Gene calling method      Glimmer 3.02
           GenBank ID               AVGH00000000
           GenBank Date of Release  August 01, 2014
           GOLD ID                  Gi0037576
MIGS-13    Project relevance        Strictly thermophilic and organic solvent-tolerant strain



Genome sequencing and assembly 

The genome of A. flavithermus subsp. yunnanensis was sequenced using a 
combination of the 454 GS FLX Titanium platform (Roche) with a shotgun 
library (1.8-kb insert size) and the Illumina HiSeq 2000 platform with two 
paired-end libraries (0.5- and 6-kb insert sizes). The 454 shotgun library 
was constructed from 500 ng of DNA with the GS Rapid Library Prep Kit 
(Roche) as described by the manufacturer; details of Illumina paired-end 
library construction and sequencing can be found at the Illumina web site. 
The 454 shotgun library generated 352,901 reads totaling 148.6 Mb, and the 
two Illumina paired-end libraries generated 1,182 Mb of raw data. The final 
assembly was based on 148.6 Mb of 454 draft data, which provides an average 
52.5× coverage of the genome, and 1,043 Mb of Illumina draft data, which 
provides an average 368.5× coverage of the genome. These sequences were 
assembled using Newbler software with a 90% identity threshold and a 40-bp 
overlap. The resulting 67 contigs were scaffolded via read-pairing 
relationships with SSPACE [22] using all available libraries of high-quality 
reads. The final assembly comprised 67 contigs arranged in 24 scaffolds, 
giving a genome size of 2,838,393 bp. 
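
The coverage figures quoted above follow from dividing the amount of sequence data by the genome length. The short Python sketch below illustrates that arithmetic with the values reported in this section; the function and variable names are illustrative and are not part of the original pipeline. Using the final assembly size as the denominator gives values slightly below the reported 52.5× and 368.5×, presumably because a rounded genome size was used in the original calculation.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average fold coverage = sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Values taken from the text; names are hypothetical.
GENOME_SIZE_BP = 2_838_393      # final assembly size
READS_454_BP = 148.6e6          # 454 data used in the assembly
READS_ILLUMINA_BP = 1_043e6     # Illumina data used in the assembly

cov_454 = fold_coverage(READS_454_BP, GENOME_SIZE_BP)
cov_illumina = fold_coverage(READS_ILLUMINA_BP, GENOME_SIZE_BP)

print(f"454 coverage:      {cov_454:.1f}x")
print(f"Illumina coverage: {cov_illumina:.1f}x")
```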

Genome annotation 

Genes were predicted by merging the results obtained from the RAST (Rapid 
Annotation using Subsystem Technology) server [23] and the Glimmer modeling 
software package [24]. The predicted coding sequences (CDSs) were translated 
and used to search the National Center for Biotechnology Information (NCBI) 
nonredundant database and the KEGG, Clusters of Orthologous Groups (COG), 
Swiss-Prot and TrEMBL databases. rRNA genes were identified with RNAmmer 
[25], and tRNA genes with tRNAscan-SE [26]. Other non-coding RNAs were 
identified by searching the genome for Rfam profiles using INFERNAL (v0.81) 
[27]. Signal peptides and numbers of transmembrane helices were predicted 
using SignalP [28] and TMHMM [29], respectively. 

Genome properties 

The genome is 2,838,393 bp long (1 chromosome, no plasmids) with a 41.4% G+C 
content (Figure 2 and Table 3). Of the 3,120 predicted genes, 3,035 were 
protein-coding genes and 85 were RNA genes. The RNA genes included ten rRNA 
genes (two 16S rRNA, one 23S rRNA and seven 5S rRNA) and 75 predicted tRNA 
genes. A total of 2,267 genes (72.66%) were assigned a putative function. 
The remaining genes were annotated as hypothetical proteins. The properties 
and statistics of the genome are summarized in Table 3. The distribution of 
genes into COG and KEGG functional categories is presented in Table 4. 
Figure 2. Graphical circular map of the chromosome. From the outside to the center: RNA 
genes (tRNA red, rRNAs purple and sRNA black) on the forward strand, genes on the forward 
strand (colored by COG categories), genes on the reverse strand, RNA genes on the reverse 
strand, G+C content, and GC skew (purple negative values, olive positive values). 
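
The two innermost rings of such a map are conventionally computed in sliding windows along the chromosome, with G+C content defined as (G+C)/(A+C+G+T) and GC skew as (G-C)/(G+C). The Python sketch below shows these two quantities on a toy sequence. The software, window size and step used for Figure 2 are not stated here, so the function names and the window settings are assumptions chosen purely for illustration.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases that are G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    acgt = g + c + seq.count("A") + seq.count("T")
    return (g + c) / acgt if acgt else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Toy sequence: a G-rich window followed by a C-rich window (illustrative only).
toy = "GGGGCCATAT" + "CCCCGGATAT"
profile = [(start, gc_content(w), gc_skew(w))
           for start, w in sliding_windows(toy, window=10, step=10)]

for start, gc, skew in profile:
    print(f"window at {start:>2}: G+C = {gc:.2f}, GC skew = {skew:+.2f}")
```

On a real chromosome the sign of the skew typically changes near the replication origin and terminus, which is why positive and negative values are colored differently in maps of this kind.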



Table 3. Nucleotide content and gene count levels of the genome 

Attribute                                 Value       % of total a)
Genome size (bp)                          2,838,393   100
DNA G+C content (bp)                      1,176,230   41.44
DNA coding region (bp)                    2,555,544   90.03
Total genes                               3,120       100
RNA genes                                 85          2.72
Protein-coding genes                      3,035       97.28
Genes with protein function prediction    2,267       72.66
Genes assigned to KEGG pathways           1,936       62.05
Genes assigned to KEGG Orthology          1,012       32.43
Genes assigned to COGs                    1,886       60.44
Genes with signal peptides                99          3.17
Genes with transmembrane helices          716         22.94

a) The total is based on either the size of the genome in base pairs or the total number 
of protein coding genes in the annotated genome 
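
The "% of total" column of Table 3 can be reproduced from the raw counts. The Python sketch below recomputes each percentage; the dictionary and function names are illustrative. Under this calculation the gene-based percentages match the reported values when the denominator is the 3,120 total predicted genes, and a few reported values appear to be truncated rather than rounded, so the comparison uses a small tolerance.

```python
GENOME_SIZE_BP = 2_838_393
TOTAL_GENES = 3_120

# (count, reported percentage) pairs copied from Table 3.
BP_ROWS = {
    "DNA G+C content": (1_176_230, 41.44),
    "DNA coding region": (2_555_544, 90.03),
}
GENE_ROWS = {
    "RNA genes": (85, 2.72),
    "Protein-coding genes": (3_035, 97.28),
    "Genes with protein function prediction": (2_267, 72.66),
    "Genes assigned to KEGG pathways": (1_936, 62.05),
    "Genes assigned to KEGG Orthology": (1_012, 32.43),
    "Genes assigned to COGs": (1_886, 60.44),
    "Genes with signal peptides": (99, 3.17),
    "Genes with transmembrane helices": (716, 22.94),
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# name -> (recomputed percentage, reported percentage)
recomputed = {}
for name, (count, reported) in BP_ROWS.items():
    recomputed[name] = (percent(count, GENOME_SIZE_BP), reported)
for name, (count, reported) in GENE_ROWS.items():
    recomputed[name] = (percent(count, TOTAL_GENES), reported)

for name, (calc, reported) in recomputed.items():
    print(f"{name:<40} computed {calc:6.2f}  reported {reported:6.2f}")
```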
Table 4. Number of genes associated with the 25 general COG functional categories 

Code  Value   %age a)   Description
J     148     4.88      Translation
A     0       0.00      RNA processing and modification
K     131     4.32      Transcription
L     153     5.04      Replication, recombination and repair
B     1       0.03      Chromatin structure and dynamics
D     27      0.89      Cell cycle control, mitosis and meiosis
Y     0       0.00      Nuclear structure
V     17      0.56      Defense mechanisms
T     93      3.06      Signal transduction mechanisms
M     85      2.80      Cell wall/membrane biogenesis
N     52      1.71      Cell motility
Z     0       0.00      Cytoskeleton
W     0       0.00      Extracellular structures
U     34      1.12      Intracellular trafficking and secretion
O     91      2.99      Posttranslational modification, protein turnover, chaperones
C     137     4.51      Energy production and conversion
G     144     4.74      Carbohydrate transport and metabolism
E     200     6.59      Amino acid transport and metabolism
F     63      2.08      Nucleotide transport and metabolism
H     103     3.39      Coenzyme transport and metabolism
I     79      2.60      Lipid transport and metabolism
P     122     4.02      Inorganic ion transport and metabolism
Q     32      1.05      Secondary metabolites biosynthesis, transport and catabolism
R     231     7.61      General function prediction only
S     169     5.57      Function unknown
-     1,234   40.66     Not in COGs

a) The total is based on the total number of protein coding genes in the annotated genome. 
Comparison with other Anoxybacillus flavithermus genomes 

At the time of writing, six genome sequences from Anoxybacillus species are 
available in the GenBank database, including four A. flavithermus subsp. 
flavithermus strains, one A. kamchatkensis strain and one Anoxybacillus sp. 
strain. Only A. flavithermus subsp. flavithermus strain WK1 and strain 
TNO-09.006 have complete genome sequences [30,31]. Here we compare the 
genome sequence of A. flavithermus subsp. yunnanensis E13T with those of the 
four A. flavithermus subsp. flavithermus strains. The draft genome of A. 
flavithermus subsp. yunnanensis E13T is similar in size to that of A. 
flavithermus subsp. flavithermus strain WK1 (2.83 vs 2.84 Mb, respectively), 
but larger than those of strain TNO-09.006, strain AK1 and strain NBRC 
109594 (2.65, 2.63 and 2.77 Mb, respectively). The G+C content of A. 
flavithermus subsp. yunnanensis E13T is similar to those of A. flavithermus 
subsp. flavithermus strain WK1, strain TNO-09.006 and strain NBRC 109594 
(41.4, 41.7, 41.8 and 41.7%, respectively), but slightly less than that of 
strain AK1 (42.7%). The gene content of A. flavithermus subsp. yunnanensis 
E13T is greater than those of A. flavithermus subsp. flavithermus strain 
WK1, strain TNO-09.006, strain AK1 and strain NBRC 109594 (3,120, 2,954, 
2,819, 2,799 and 2,963 genes, respectively). In addition, A. flavithermus 
subsp. yunnanensis E13T shared a mean genome sequence similarity of 90% 
(range 80-99%), 90% (79-100%), 86% (73-99%) and 91% (71-100%) with A. 
flavithermus subsp. flavithermus strain WK1, strain TNO-09.006, strain AK1 
and strain NBRC 109594, respectively. 
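
The figures in this comparison are easier to scan when collected in one structure. The Python sketch below tabulates the genome size, G+C content and gene count quoted above for each strain and prints the difference between E13T and each A. flavithermus subsp. flavithermus strain. The class and variable names are illustrative, and the values are simply those given in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GenomeSummary:
    strain: str
    size_mb: float
    gc_percent: float
    genes: int


# Values as quoted in the comparison above.
REFERENCE = GenomeSummary("E13T", 2.83, 41.4, 3120)
OTHERS = [
    GenomeSummary("WK1", 2.84, 41.7, 2954),
    GenomeSummary("TNO-09.006", 2.65, 41.8, 2819),
    GenomeSummary("AK1", 2.63, 42.7, 2799),
    GenomeSummary("NBRC 109594", 2.77, 41.7, 2963),
]


def differences(ref: GenomeSummary, other: GenomeSummary) -> Tuple[float, float, int]:
    """Return (size, G+C, gene count) of ref minus other."""
    return (
        round(ref.size_mb - other.size_mb, 2),
        round(ref.gc_percent - other.gc_percent, 1),
        ref.genes - other.genes,
    )


rows = {g.strain: differences(REFERENCE, g) for g in OTHERS}

print(f"{'strain':<12} {'size (Mb)':>10} {'G+C (%)':>8} {'genes':>6}")
for strain, (d_size, d_gc, d_genes) in rows.items():
    print(f"{strain:<12} {d_size:>+10.2f} {d_gc:>+8.1f} {d_genes:>+6d}")
```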